Nature Genetics
Top medRxiv preprints most likely to be published in this journal, ranked by match strength.
Mapping the pleiotropic effect of genetic variation on biological processes and complex phenotypes is fundamental to extracting translational insight from genome-wide association studies (GWAS). Here we present The Human Genotype-Phenotype Map (GPMap), a repository of colocalizing genetic associations across 15,997 complex traits and 2.7 million molecular measurements, leveraging common and rare variants and cis- and trans-acting effects across disaggregated tissue types and single cell datasets ...
Genome-wide association studies (GWAS) have implicated tens of thousands of genetic variants associated with complex traits and polygenic diseases. Colocalizing GWAS variants with variants that may regulate gene expression, via expression quantitative trait loci (eQTL) mapping, has successfully led to the identification of disease-critical genes and their cell types of action. Recent studies predominantly colocalize proximal cis-eQTLs, which are estimated to regulate ~10% of variance in gene e...
Methods that analyze single-cell RNA-seq+ATAC-seq multiome data have shown promise in linking enhancers to target genes by correlating chromatin accessibility with gene expression across cells. However, correlations among ATAC-seq peaks may induce non-causal tagging peak-gene links (analogous to tagging associations in GWAS); indeed, we confirm that tagging effects induced by peak co-accessibility are pervasive in peak-gene linking. We defined two scores for each ATAC-seq peak: co-accessibility ...
Background: Attention-deficit/hyperactivity disorder (ADHD) is a common heritable neurodevelopmental disorder, affecting ~7 million children (11.4%) in the U.S. However, ADHD's underlying genetic architecture remains largely unknown. Transcriptome-wide association studies (TWAS), which integrate expression quantitative trait loci (eQTL) and GWAS summary data, can identify differentially expressed risk genes underlying complex phenotypes. Here we conduct a TWAS of ADHD using expression data from m...
Sinonasal squamous cell carcinoma (SNSCC) is an aggressive head and neck cancer of the sinonasal cavity which has not benefitted from therapeutic advances over decades [1]. Though historically attributed to inhaled carcinogens such as hardwood dust and tobacco smoking [2], SNSCC is incidentally associated with human papillomavirus (HPV) [3,4]. Importantly, HPV is the primary oncogenic driver of >80% of anatomically adjacent oropharyngeal cancers [5]. While viral status drives clinical staging and treatment ...
Structural variants (SVs) are a major source of genomic diversity and disease susceptibility; however, populations from the Middle East and North Africa (MENA) region remain critically underrepresented in global reference databases. We provide the first detailed catalogue of structural variation in 61 individuals from diverse MENA countries, using publicly available ultra-long Oxford Nanopore sequencing. A scalable and dual-reference alignment-based method (GRCh38 and T2T-CHM13) was employed to ...
Mitochondria are semi-autonomous organelles whose generation and maintenance demand precise expression, processing, and assembly of >1,000 proteins encoded across two genomes. To explore this cooperativity, we performed multiomic analyses on >200 cell lines harboring mitochondrial gene perturbations, generating >26M molecular measurements. Our data reveal that mitochondrial proteome homeostasis is heavily influenced by post-transcriptional processes. Through nearest neighbor analyses, we reveal ...
Cohesin is a fundamental genome-organizing complex that orchestrates three-dimensional chromosome folding and gene expression via DNA loop extrusion. Alterations to genes encoding cohesin subunits and cohesin loaders cause Mendelian disorders, including Cornelia de Lange syndrome (CdLS). By contrast, disruption of factors that remove cohesin from DNA, including WAPL and its binding partners PDS5A and PDS5B, has not yet been associated with human disease. Here, we explored the relevance of these...
Autism spectrum disorder (ASD; MIM 209850) is reported to vary globally from 0.01% in East Asian populations to 4.36% in certain Australian cohorts. Despite high heritability estimates (61-94%), the genetic architecture underlying ASD susceptibility remains poorly characterized across diverse populations, as most genomic studies have initially focused on individuals of European ancestry. To investigate ancestry-specific genetic contributions to ASD, we analyzed whole-genome sequencing data from ...
Low-frequency variants (LFVs), defined by minor allele frequencies (MAF) of 1-5%, occupy the gap between common and rare variants in both frequency and effect size. The conventional genome-wide association study (GWAS) significance threshold (5×10⁻⁸) is overly conservative for LFVs, which account for more than 25% of variants in GWAS. This limitation may obscure meaningful associations in highly heritable yet genetically complex disorders such as autism spectrum disorder (ASD). We hypothesize tha...
Background: Most rare coding variants in monogenic disease genes remain classified as Variants of Uncertain Significance (VUS), limiting their use in clinical care. Many variant classifications have been submitted to ClinVar, often with rich free-text summaries of the evidence underlying each classification. These narratives are not standardized and are difficult to mine systematically, making it challenging to identify variants that might be reclassified as new evidence becomes available. Method...
Biobanks with longitudinal measurements have advanced our understanding of time-to-event (TTE) traits including age-of-onset and disease progression. However, limited work has characterized the heritability of TTE traits, a key parameter for comparisons of total association and predictive power. Here, we present COXMM, a Cox proportional hazard mixed model for estimating TTE heritability. Simulations show our model achieves nearly unbiased results, whereas non-TTE approaches severely underestima...
Chromosome 5p15.33 harbors several independent association signals that demonstrate antagonistic pleiotropy across cancer types, with causal mechanisms largely unresolved. To identify functional variants and enhancer elements at this locus, we performed statistical fine-mapping followed by massively parallel reporter assays (MPRA) and proliferation-based CRISPRi screens. This approach identified eight multi-cancer functional variants (MCFVs) across three GWAS signals. Targeting rs421629 (part o...
Both short and long sleep duration have been associated with poor glycemic control and an increased risk of developing type 2 diabetes mellitus. Although sleep duration may differentially modify the effects of genetic risk factors for type 2 diabetes, this has not been systematically investigated. In the present study, we conducted genome-wide gene by sleep duration meta-analyses, separately assessing interactions of short and long sleep, for fasting glucose, fasting insulin, and hemoglobin A1c ...
LDB1 encodes the transcriptional regulator LIM-domain-binding protein 1, which plays an important role in neurogenesis. Few C-terminal likely gene-disrupting (LGD) variants have been reported in the literature in individuals with congenital ventriculomegaly. Through international collaboration, we have now assembled a cohort of 16 individuals with de novo variants affecting various regions of LDB1. Eleven variants affect either the whole gene or the N-terminal dimerization domain (including gene ...
Rare Mendelian disorders affect 300-400 million people globally. Although genetic testing has become widely adopted, gene-specific evidence for tailored variant interpretation remains scattered across resources. We present Gene Portals, a framework for gene-centered multimodal knowledge bases that co-localize expert-harmonized clinical data, functional assays, population variation, structural annotations and gene-specific ACMG/AMP specifications within a single resource. A modular interface inte...
Objective: To identify risk loci for Fuchs endothelial corneal dystrophy (FECD) and improve a genetic risk prediction model. Design: Genome-wide association study (GWAS), polygenic risk score (PRS) construction, and TCF4 CTG18.1 short tandem repeat (STR) length inference. Participants: The study included 7,316 Europeans (EUR) with FECD or related corneal dystrophy phenotypes and 1,588,467 controls from the UK Biobank, All of Us, FinnGen, and the Million Veteran Program. Two independent EUR FECD coho...
Human genetics has become a cornerstone of drug target discovery, yet the value of Mendelian randomization (MR) for predicting clinical success remains uncertain. Here, we systematically evaluated MR across 11,482 target-indication pairs with documented Phase II clinical outcomes to assess its utility for drug development. We find that MR statistical significance alone does not enrich for Phase II success, in contrast to genome-wide association study (GWAS) support, which confers an increase in ...
Age-related hearing loss (ARHL) is a progressive, bilateral decline in hearing ability that affects one in four individuals over 60 years of age worldwide. While previous genome-wide association studies (GWAS) have identified distinct single-nucleotide variants (SNVs) associated with metabolic and sensory ARHL phenotypes, the contribution of short tandem repeats (STRs) - a neglected yet important class of genetic variants - remains poorly understood. To address this gap, TRTools was used to impu...
Body mass index (BMI), type 2 diabetes (T2D) and associated cardiometabolic features modify Alzheimer's disease (AD) risk, yet shared mechanisms remain poorly understood. Using sex- and age-stratified genotyping data for BMI and T2D, we investigate how these traits converge on shared genetic pathways to AD risk. Employing multi-trait, machine learning and single-cell transcriptomics, we identify sex-specific cardiometabolic liability linked to higher BMI-associated risk in women and T2D-driven ri...